5 Hypothesis testing
5.1 Preparation
- Load packages:
library("readxl")
library("tidyverse")
- Load the data sets:
data <- read_xlsx("Paquot_Larsson_2020_data.xlsx")
data_vowels <- read.csv("Vowels_Apache.csv", sep = "\t")
5.2 Hypothesis testing
The first step is to define the null hypothesis \(H_0\) and the alternative hypothesis \(H_1\) (or \(H_a\)).
Given two categorical variables \(X\) and \(Y\), we assume under \(H_0\) that both variables are independent of each other. This hypothesis describes the “default state of the world” (James et al. 2021: 555), i.e., what we would usually expect to see. By contrast, the alternative hypothesis \(H_1\) states that \(X\) and \(Y\) are not independent, i.e., that \(H_0\) does not hold.
In this unit, we will consider two scenarios:
- We are interested in finding out whether English clause order (ORDER: ‘sc-mc’ or ‘mc-sc’) depends on the type of the subordinate clause (SUBORDTYPE), which can be either temporal (‘temp’) or causal (‘caus’). Our hypotheses are:
  - \(H_0:\) The variables ORDER and SUBORDTYPE are independent.
  - \(H_1:\) The variables ORDER and SUBORDTYPE are not independent.
- As part of a phonetic study, we compare the base frequencies of the F1 formant for male and female speakers of Apache. We put forward the following hypotheses:
  - \(H_0:\) mean F1 frequency of men \(=\) mean F1 frequency of women.
  - \(H_1:\) mean F1 frequency of men \(\ne\) mean F1 frequency of women.
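The second scenario can be sketched with a two-sample comparison in R. Note that the F1 values below are simulated for illustration (the means, standard deviations, and sample sizes are invented, not taken from the Apache data); the actual analysis would use the data_vowels data set.

```r
set.seed(1)
# Simulated F1 frequencies in Hz (illustrative values, not the Apache data)
f1_men   <- rnorm(60, mean = 400, sd = 50)
f1_women <- rnorm(60, mean = 430, sd = 50)

# Welch's two-sample t-test: H0 states that both group means are equal
res <- t.test(f1_men, f1_women)
res$statistic  # observed t-score
res$p.value    # probability of a t-score at least this extreme under H0
```

t.test() with two numeric vectors performs Welch's two-sample t-test by default, which does not assume equal variances in the two groups.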
Based on our data, we can decide either to retain or to reject \(H_0\). Rejecting \(H_0\) can be viewed as evidence in favour of \(H_1\) and thus marks a potential ‘discovery’ in the data. However, there is always a chance that our decision about \(H_0\) is wrong; the four possible constellations are summarised in the table below (cf. Heumann, Schomaker, and Shalabh 2022: 223):
| | \(H_0\) is true | \(H_0\) is not true |
|---|---|---|
| \(H_0\) is not rejected | \(\color{green}{\text{Correct decision}}\) | \(\color{red}{\text{Type II } (\beta)\text{-error}}\) |
| \(H_0\) is rejected | \(\color{red}{\text{Type I } (\alpha)\text{-error}}\) | \(\color{green}{\text{Correct decision}}\) |
The probability of a Type I error, which refers to the rejection of \(H_0\) although it is true, is called the significance level \(\alpha\), which has a conventional value of \(0.05\) (i.e., a 5% chance of committing a Type I error).
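The decision logic can be illustrated with a small simulation. In the sketch below, the two categorical variables are generated independently of each other, so \(H_0\) is true by construction and a significant result would be a Type I error; the variable names merely echo the first scenario and carry no linguistic content.

```r
set.seed(42)
# Two independently sampled categorical variables: H0 holds by construction
subordtype <- sample(c("temp", "caus"), size = 200, replace = TRUE)
order      <- sample(c("sc-mc", "mc-sc"), size = 200, replace = TRUE)

# Chi-squared test of independence on the 2x2 contingency table
res <- chisq.test(table(subordtype, order))
res$p.value          # under H0, p < 0.05 occurs in only ~5% of such simulations
res$p.value < 0.05   # TRUE here would be a Type I error
```

Re-running the simulation with many different seeds would yield p < 0.05 in roughly 5% of runs, which is exactly what the significance level \(\alpha = 0.05\) describes.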
5.3 Constructing the critical region
An important question remains: how large must the difference be for us to reject \(H_0\)? The \(p\)-value measures the probability of obtaining a test statistic at least as extreme as the one observed, under the assumption that \(H_0\) holds. For example, a \(p\)-value of \(0.02\) means that we would see a \(\chi^2\)-score (or \(T\), \(F\), etc.) this extreme only 2% of the time if \(X\) and \(Y\) were unrelated (or if there were no difference between \(\bar{x}\) and \(\bar{y}\), respectively). Since our significance level \(\alpha\) is set to \(0.05\), we reject the null hypothesis only if this probability falls below 5%.
We obtain \(p\)-values by consulting the probability density functions of the underlying distributions:
- Probability density function for the \(\chi^2\)-distribution with \(df = 1\)
Code
# Generate random samples from a chi-squared distribution with 1 degree of freedom
x <- rchisq(100000, df = 1)
# Create histogram
hist(x,
breaks = "Scott",
freq = FALSE,
xlim = c(0, 20),
ylim = c(0, 0.2),
ylab = "Probability density of observing a specific score",
xlab = "Chi-squared score",
main = "Histogram for a chi-squared distribution with 1 degree of freedom (df)",
cex.main = 0.9)
# Overlay PDF
curve(dchisq(x, df = 1), from = 0, to = 150, n = 5000, col = "orange", lwd = 2, add = TRUE)
- Probability density function for the \(t\)-distribution with \(df = 112.19\)
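Rather than reading the tail area off the histogram, we can compute it directly with pchisq(). As an illustration, the score 3.84 is (approximately) the critical \(\chi^2\)-value at \(\alpha = 0.05\) for \(df = 1\), so its upper-tail probability comes out at roughly 0.05:

```r
# Upper-tail probability (p-value) of a chi-squared score of 3.84 with df = 1
p <- pchisq(3.84, df = 1, lower.tail = FALSE)
p  # approximately 0.05
```

Any \(\chi^2\)-score above this critical value therefore yields \(p < 0.05\) and leads to the rejection of \(H_0\).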
Code
# Given t-statistic and degrees of freedom
t_statistic <- 2.4416
df <- 112.19
# Generate random samples from a t-distribution with the given degrees of freedom
x <- rt(100000, df = df)
# Create histogram
hist(x,
breaks = "Scott",
freq = FALSE,
xlim = c(-5, 5),
ylim = c(0, 0.4),
ylab = "Probability density of observing a specific score",
xlab = "t-score",
main = "Histogram for a t-distribution with 112.19 degrees of freedom",
cex.main = 0.9)
# Overlay PDF
curve(dt(x, df = df), from = -5, to = 5, n = 5000, col = "orange", lwd = 2, add = TRUE)
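Analogously, the two-sided \(p\)-value for the \(t\)-statistic given above can be obtained with pt(). We double the upper-tail probability because \(H_1\) is non-directional (the mean F1 of men could be either higher or lower than that of women):

```r
t_statistic <- 2.4416
df <- 112.19

# Two-sided p-value: probability of a t-score at least this extreme under H0
p <- 2 * pt(abs(t_statistic), df = df, lower.tail = FALSE)
p  # below alpha = 0.05, so H0 is rejected
```

Since this probability falls below the significance level \(\alpha = 0.05\), we reject \(H_0\) and conclude that the mean F1 frequencies of male and female speakers differ.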